We study the structure of the language of binary cube-free words. Namely, we are interested in the 
cube-free words that cannot be infinitely extended preserving cube-freeness. We show the existence 
of such words with arbitrarily long finite extensions, both to one side and to both sides. 

1 Introduction 

The study of repetition-free words and languages remains quite popular in combinatorics on words: many
interesting and challenging problems are still open. The most popular repetition-free binary languages
are the cube-free language CF and the overlap-free language OF. The language CF is much bigger and
has a much more complicated structure. For example, the number of overlap-free binary words grows only
polynomially with the length [8], while the language of cube-free words has exponential growth [3]. The
most accurate bounds for the growth of OF are given in [5] and for the growth of CF in [13]. Further,
there is an essentially unique nontrivial morphism preserving OF [10], while there are uniform morphisms
of any length preserving CF [5]. The sets of two-sided infinite overlap-free and cube-free binary words
also have quite different structure, see [12].

Any repetition-free language can be viewed as a poset with respect to prefix, suffix, or factor order.
In case of prefix [suffix] order, the diagram of such a poset is a tree; each node generates a subtree and
is a common prefix [respectively, suffix] of its descendants. The following questions arise naturally.
Does a given word generate a finite or an infinite subtree? Are the subtrees generated by two given words
isomorphic? Can words generate arbitrarily large finite subtrees? For some power-free languages, the
decidability of the first question was proved in [01 as a corollary of interesting structural properties. The
third question for ternary square-free words constitutes Problem 1.10.9 of [1]. For all $k$th power-free
languages, it was shown in [2] that the subtree generated by any word has at least one leaf. Note that
considering the factor order instead of the prefix or the suffix one, we get a more general acyclic graph
instead of a tree, but can still ask the same questions about the structure of this graph. For the language
OF, all these questions were answered in [11, 14], but almost nothing is known about the same questions
for CF.

In this paper, we answer the third question for the language CF in the affirmative. Namely, we
construct cube-free words that generate subtrees of any prescribed depth and then extend this result to
the subgraphs of the diagram of factor order.

2 Preliminaries

Let us recall necessary notation and definitions. We consider finite and infinite words over the binary
alphabet $\Sigma = \{a,b\}$. If $x$ is a letter, then $\bar x$ denotes the other letter. By default, "word" means a
finite word. Words are denoted by uppercase characters (to denote one-sided infinite words, we add the
subscript $\infty$ at the corresponding side). We write $\lambda$ for the empty word, and $|W|$ for the length of the
word $W$. The letters of nonempty finite and right-infinite words are numbered from 1; thus,
$W = W(1)W(2)\cdots W(|W|)$. The letters of left-infinite words are numbered by all nonnegative integers,
starting from the right.

We use standard definitions of factors, prefixes, and suffixes of a word. The factor $W(i)\cdots W(j)$
is written as $W(i \ldots j)$. A positive integer $p \le |W|$ is a period of a word $W$ if $W(i) = W(i+p)$ for all
$i \in \{1,\ldots,|W|-p\}$. The minimal period of $W$ is denoted by $\mathrm{per}(W)$. The exponent of a word is the
ratio between its length and its minimal period: $\exp(W) = |W|/\mathrm{per}(W)$. Words of exponent 2 and 3
are called squares and cubes, respectively. The local exponent of a word is the number
$\mathrm{lexp}(W) = \sup\{\exp(V) \mid V \text{ is a factor of } W\}$. Periodic words possess the interaction property
expressed by the textbook Fine and Wilf theorem: if a word $U$ has periods $p$ and $q$, and
$|U| \ge p + q - \gcd(p,q)$, then $U$ has the period $\gcd(p,q)$.
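The definitions above can be tried out on small words. The following is a minimal Python sketch, not part of the construction: the function names (`periods`, `per`, `exponent`, `lexp`, `fine_wilf_holds`) are chosen here for illustration, words are plain strings, and positions are 0-based rather than 1-based as in the text.

```python
from fractions import Fraction
from math import gcd


def periods(w: str) -> list[int]:
    """Return all periods p (1 <= p <= len(w)) of a nonempty word w."""
    n = len(w)
    return [p for p in range(1, n + 1)
            if all(w[i] == w[i + p] for i in range(n - p))]


def per(w: str) -> int:
    """Minimal period of w."""
    return periods(w)[0]


def exponent(w: str) -> Fraction:
    """exp(w) = |w| / per(w), kept exact as a fraction."""
    return Fraction(len(w), per(w))


def lexp(w: str) -> Fraction:
    """Local exponent: the maximum exponent over all nonempty factors of w."""
    n = len(w)
    return max(exponent(w[i:j]) for i in range(n) for j in range(i + 1, n + 1))


def fine_wilf_holds(w: str) -> bool:
    """Check the Fine and Wilf implication for every pair of periods of w."""
    ps = periods(w)
    return all(gcd(p, q) in ps
               for p in ps for q in ps
               if len(w) >= p + q - gcd(p, q))
```

For instance, the word abaababaaba of length 11 has periods 5 and 8 but not the period 1; this does not contradict the theorem, since 11 < 5 + 8 - 1.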

A word $W$ is $\beta$-free [$\beta^+$-free] if $\mathrm{lexp}(W) < \beta$ [respectively, $\mathrm{lexp}(W) \le \beta$]. The 3-free words are
called cube-free, and the $2^+$-free words are overlap-free. The language of all cube-free [overlap-free]
words over $\Sigma$ is denoted by CF [respectively, OF]. A morphism $f : \Sigma^+ \to \Sigma^+$ avoids an exponent $\beta$ if the
condition $\mathrm{lexp}(U) < \beta$ implies $\mathrm{lexp}(f(U)) < \beta$ for any word $U$. The following theorem allows one to
check cube-freeness of a morphism over the binary alphabet.

Theorem 1 ([9]). A morphism $f : \Sigma^+ \to \Sigma^+$ is cube-free if and only if the word
$f(aabbababbabbaabaababaabb)$ is cube-free.
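The test of Theorem 1 is easy to run mechanically. Below is a small Python sketch under the assumption that a morphism is given as a dictionary from letters to words; the helper names (`has_cube`, `apply_morphism`, `passes_cube_free_test`) are illustrative and do not come from the paper. The cube search is a plain brute-force scan, which is sufficient for images of the 24-letter test word.

```python
TEST_WORD = "aabbababbabbaabaababaabb"  # the test word from Theorem 1


def has_cube(w: str) -> bool:
    """Return True if w contains a factor of the form uuu with u nonempty."""
    n = len(w)
    for p in range(1, n // 3 + 1):          # p = |u|
        for i in range(n - 3 * p + 1):      # i = start of the candidate cube
            if w[i:i + p] == w[i + p:i + 2 * p] == w[i + 2 * p:i + 3 * p]:
                return True
    return False


def apply_morphism(f: dict[str, str], w: str) -> str:
    """Image of the word w under the morphism f (given letter by letter)."""
    return "".join(f[c] for c in w)


def passes_cube_free_test(f: dict[str, str]) -> bool:
    """Criterion of Theorem 1: the image of the test word has no cube."""
    return not has_cube(apply_morphism(f, TEST_WORD))
```

For example, the morphism a -> ab, b -> ba passes the test, while a -> a, b -> aa fails it because the image of ab is already the cube aaa.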

The Thue-Morse morphism $\theta$ is defined over $\Sigma^+$ by the rules $\theta(a) = ab$, $\theta(b) = ba$. The words

$$T_n^a = \theta^n(a), \qquad T_n^b = \theta^n(b) \qquad (n \ge 0)$$

are called Thue-Morse blocks or simply $n$-blocks. From the definition it follows that $T_{n+1}^x = T_n^x T_n^{\bar x}$.
Hence, the sequences $\{T_n^a\}$ and $\{T_n^b\}$ have "limits", which are the right-infinite Thue-Morse words $T_\infty^a$ and
$T_\infty^b$, respectively. We also consider the reversal ${}_\infty T$ of $T_\infty^a$. The factors of Thue-Morse words are
Thue-Morse factors; the set of all these factors is denoted by TM. Note that any word in TM can be written as
$W = xQ_1\cdots Q_ny$, where $x,y \in \Sigma \cup \{\lambda\}$ and $Q_1,\ldots,Q_n \in \{ba,ab\}$. It is known since Thue [15] that
$\mathrm{TM} \subset \mathrm{OF}$.
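As a quick sanity check of the block notation, the following Python sketch generates Thue-Morse blocks and tests the recurrence and overlap-freeness on small cases. The names `bar`, `theta`, `block`, and `has_overlap` are chosen here for illustration only.

```python
def bar(x: str) -> str:
    """The other letter of the binary alphabet {a, b}."""
    return "b" if x == "a" else "a"


def theta(w: str) -> str:
    """Thue-Morse morphism: a -> ab, b -> ba."""
    return "".join(c + bar(c) for c in w)


def block(n: int, x: str) -> str:
    """The n-block obtained by applying theta n times to the letter x."""
    w = x
    for _ in range(n):
        w = theta(w)
    return w


def has_overlap(w: str) -> bool:
    """True if w has a factor of length 2p + 1 with period p (an overlap)."""
    n = len(w)
    for p in range(1, (n - 1) // 2 + 1):
        for i in range(n - 2 * p):
            if w[i:i + p + 1] == w[i + p:i + 2 * p + 1]:
                return True
    return False
```

With these helpers, `block(2, "a")` is abba and `block(2, "b")` is baab, which are the 2-blocks used throughout the construction.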

Let $L \subset \Sigma^*$ and $W \in L$. Any word $U \in \Sigma^*$ such that $UW \in L$ is called a left context of $W$ in $L$. The
word $W$ is left maximal [left premaximal] if it has no nonempty left contexts [respectively, finitely many
left contexts]. The level of the left premaximal word $W$ is the length of its longest left context; thus, left
maximal words are of level 0. The right counterparts of the above notions are defined in a symmetric
way. We say that a word is maximal [premaximal] if it is both left and right maximal [respectively,
premaximal]. The level of a premaximal word $W$ is the pair $(n,k) \in \mathbb{N}_0^2$ such that $n$ and $k$ are the length of
the longest left context of $W$ and the length of its longest right context, respectively.

In particular, a word $W \in \mathrm{CF}$ is maximal if by adding any of the two letters on the left or on the right
we obtain a cube. The word aabaabaa is an example of such a word.
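The maximality of aabaabaa can be confirmed by a direct check of the four one-letter extensions. The sketch below is a brute-force illustration; `has_cube` and `is_maximal_cube_free` are names introduced here, not notation from the paper.

```python
def has_cube(w: str) -> bool:
    """Return True if w contains a factor of the form uuu with u nonempty."""
    n = len(w)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if w[i:i + p] == w[i + p:i + 2 * p] == w[i + 2 * p:i + 3 * p]:
                return True
    return False


def is_maximal_cube_free(w: str, alphabet: str = "ab") -> bool:
    """w is cube-free, and every one-letter extension on either side has a cube."""
    if has_cube(w):
        return False
    return all(has_cube(c + w) and has_cube(w + c) for c in alphabet)
```

Here the four extensions produce the cubes aaa, (baa)^3, aaa, and (aab)^3, respectively.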

The aim of this paper is to prove the following theorems: 

Theorem 2. In CF, there exist left premaximal words of any level $n \in \mathbb{N}_0$.

Theorem 3. In CF, there exist premaximal words of any level $(n,k) \in \mathbb{N}_0^2$.

3 Construction of premaximal words 

Theorem 2 is proved by exhibiting a series of left premaximal words, containing words of any level. The
series is constructed in two steps:

1. building an auxiliary series $\{W_n\}_0^\infty$ such that each word $W_n$ has, up to one easily handled exception,
a unique left context of any length $\le n$;

2. completing the word $W_n$ to a left premaximal word $\bar W_n$.

If a word $W \in \mathrm{CF}$ has a unique left context of length $n$, say $U$, and two left contexts of length $n+1$,
then we say that $U$ is the fixed left context of $W$ (see the picture below).

Example 1. Let $W = aabaaba$. Since $aW = aaa\cdots$, $abW = (aba)^3$, but $aabbW, babbW \in \mathrm{CF}$, we see
that the fixed left context of the word $W$ equals $abb$.
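Example 1 can be reproduced by enumerating left contexts length by length. The following Python sketch does this by brute force; the function names and the `max_len` cut-off are assumptions made for the illustration. It relies on the fact that every suffix of a left context is again a left context, so contexts of length n+1 are obtained by prepending one letter to contexts of length n.

```python
def has_cube(w: str) -> bool:
    """Return True if w contains a factor of the form uuu with u nonempty."""
    n = len(w)
    for p in range(1, n // 3 + 1):
        for i in range(n - 3 * p + 1):
            if w[i:i + p] == w[i + p:i + 2 * p] == w[i + 2 * p:i + 3 * p]:
                return True
    return False


def left_contexts(w: str, n: int, alphabet: str = "ab") -> list[str]:
    """All words u of length n such that u + w is cube-free (w must be cube-free)."""
    contexts = [""]
    for _ in range(n):
        contexts = [c + u for u in contexts for c in alphabet
                    if not has_cube(c + u + w)]
    return contexts


def fixed_left_context(w: str, max_len: int = 20, alphabet: str = "ab"):
    """Return the fixed left context of w, or None if none is found up to max_len."""
    prev = [""]
    for _ in range(max_len):
        cur = [c + u for u in prev for c in alphabet if not has_cube(c + u + w)]
        if len(cur) == 2:      # the unique context branches here
            return prev[0]
        if len(cur) != 1:      # no context of this length at all
            return None
        prev = cur
    return None
```

For W = aabaaba the unique left contexts of lengths 1, 2, 3 are b, bb, abb, and there are two left contexts of length 4, so the function returns abb.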

Now let us explain step 1. We build the series $\{W_n\}_0^\infty$ inductively, one word per iteration, in a way
that the fixed left context $X_n$ of the word $W_n$ is of length $\ge n$ (we will discuss the mentioned exception at
the moment of its appearance). We put $W_0 = aabaaba$ and note that the left-infinite word

$${}_\infty T\,abaaba = \cdots abba\;baab\;baab\;abb\,W_0$$

is cube-free. So, we require that each word $W_n$ satisfies the following properties:

(W1) $W_n$ starts with $W_0$;

(W2) any word ${}_\infty T(k \ldots 1)$ is a left context of $W_n$;

(W3) some word ${}_\infty T(k \ldots 1)$ with $k \ge n$ is the fixed left context of $W_n$, denoted by $X_n$;

(W4) if $|X_n| > n$, then $W_{n+1} = W_n$ (trivial iterations).

The basic idea for obtaining $W_{n+1}$ from $W_n$ at nontrivial iterations is to let

$$W_{n+1} = W_n\,xX_n\,W_n\,xX_n\,W_n, \tag{1}$$

where $x$ is the letter "prohibited" at the $(n+1)$th iteration, i.e., $xX_n$ certainly is not a left context of $W_{n+1}$.
Thus, the fixed left context of $W_{n+1}$ is longer than the one of $W_n$ by definition.

Remark 1. An attempt to build the series $\{W_n\}_0^\infty$ directly by (1) fails because cubes will occur at the
border of some words $W_n$ and $xX_n$. For instance, let us construct the word $W_4$. We have $W_3 = W_0$ in view
of (W4) and Example 1, $X_3 = abb$, and the context $aabb$ should be forbidden in view of (W2), because
${}_\infty T(4 \ldots 1) = babb$. So, $x = a$ and the word $W_3xX_3$ has the factor $aaa$.

A way out from this situation is the following idea: we insert a special "buffer" word after each of
three occurrences of $W_n$ in (1). This insertion allows us to avoid local cubes at the border. Below we use
the following notation:

- $P'_n = xX_n$, $P_n = \bar xX_n$, where $x$ is the letter prohibited at the $(n+1)$th iteration; thus, $P_n \in \mathrm{TM}$;

- $S_n$ is the word inserted after $W_n$ at the $(n+1)$th iteration;

- $S'_n = S_0S_1\cdots S_n$ is the factor of $W_{n+1}$ between $W_0$ and the nearest occurrence of $P'_n$;
In these terms, we have the following expressions for $W_{n+1}$ for any nontrivial iteration:

$$W_{n+1} = W_nS_n\,xX_n\,W_nS_n\,xX_n\,W_nS_n \tag{2a}$$

(2b)

The structure of the word $W_{n+1}$ imposes the following restrictions on the words $S_n$ and $S_{n+1}$.

(S1) Since the word $X_{n+1}W_{n+1}S_{n+1}$ is a factor of $W_{n+2}$, $X_{n+1}$ ends with $X_n$, and $X_nW_{n+1}x = (X_nW_nS_nx)^3$
by (2a), the word $S_{n+1}$ must start with $\bar x$, which is the first letter of $P_n$;

(S2) Since the word $S_nxX_n$ is a factor of $W_{n+1}$, if $X_n$ starts with $x$ [$\bar xx\bar xx$], then $S_n$ ends with $\bar x$ [respectively,
$x$]. (Recall that $X_n \in \mathrm{TM}$ is an overlap-free word, whence any other prefix of $X_n$ does not restrict
the last letter of $S_n$.)

Thus, our first goal is to find the words $S_n$ satisfying (S1) and (S2) such that all words $S'_n$ are cube-free.
In other words, we have to construct a cube-free right-infinite word $S'_\infty = S_0S_1\cdots S_n\cdots$. The following
lemma is easy.

Lemma 1. The letters ${}_\infty T(n)$ and ${}_\infty T(n-1)$ coincide if and only if $n = m \cdot 2^k$ for some odd integers $m$
and $k$.

Remark 2. If the only left context of length $n$ of the word $W_n$ begins with $xx$, then $|X_n| > n$, because the
letter before $xx$ is also fixed. Thus, by (W4) we have $W_{n+1} = W_n$ (and then $S_n = \lambda$) for all values of $n$
mentioned in Lemma 1. For all other values of $n$ ($n \ge 3$), the iterations will be nontrivial.

While constructing the word $S'_\infty$ we follow the next four rules:

1. For all nontrivial iterations, S n G {T£,T£T£,T£,T£T£T?,T?,T{T£\x G £}; hence, $S_n \in \mathrm{TM}$.

2. Whenever possible, we choose $S_n$ to be a 2-block or a product of 2-blocks.

3. Otherwise, if $S_n$ ends with the block $T_1^x$, we put $S_{n+1} = T_1^{\bar x}$ or S n+ [ = TfT£ (or the same possibilities
for $S_{n+2}$ if $S_{n+1} = \lambda$).

4. If $S_n \ne \lambda$ and there is no restriction (S2) on the last letter of $S_n$, we add this restriction artificially.
Namely, we fix the last letter of $S_n$ to be $\bar x$ if $S_{n-1}$ ends with $x$ (or if $S_{n-2}$ ends with $x$ while
$S_{n-1} = \lambda$).

Taking rules 1-4 into account, we can prove, by case examination, the following lemma about the
first and the last letters of the words $S_n$.

Lemma 2. (1) If $S_n$ ends with $x$, then either $S_{n+1}$ ends with $\bar x$, or $S_{n+1} = \lambda$ and $S_{n+2}$ ends with $\bar x$.

(2) The first letter of a nonempty word $S_n$ coincides with the last one for all $n$, except for the cases when

The construction of the word $S'_\infty$, the correctness of which we will prove, is given by Table 1.
According to this table, rule 3 applies to $S_n$ if and only if $P_n$ starts with $\bar xx\bar xx$. Hence if the word $P_n$ has such
a prefix, then $P_{n-1}$ (or $P_{n-2}$ if the $(n-1)$th iteration is trivial) has no such prefix; as a result, the word
$S_{n-1}$ (respectively, $S_{n-2}$) ends with a 2-block.

Now consider the case $P_n = \bar xx\bar xx\cdots$ in more detail. Without loss of generality, let $P_n$ start with $b$.
Then $P_n = babaab\cdots$. Since $P'_n = aabaab\cdots$, the word $S_n$ cannot end with $a$ or with $baab$; thus, it cannot
end with a 2-block and we should use rule 3.
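Lemma 1 lends itself to a quick numerical check. The Python sketch below assumes, as the numbering convention of the Preliminaries suggests, that the letter ${}_\infty T(n)$ is the letter number $n$ (counting from 0) of the Thue-Morse word starting with $a$, so it is determined by the parity of the number of ones in the binary expansion of $n$. The function names are illustrative.

```python
def tm_letter(n: int) -> str:
    """Letter number n (from 0) of the Thue-Morse word abbabaab..."""
    return "ab"[bin(n).count("1") % 2]


def lemma1_condition(n: int) -> bool:
    """True iff n = m * 2**k with m and k both odd (n must be positive)."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    # After stripping all factors of 2, the remaining m is odd automatically.
    return k % 2 == 1


def lemma1_holds_up_to(limit: int) -> bool:
    """Compare both sides of Lemma 1 for all n in 1..limit."""
    return all((tm_letter(n) == tm_letter(n - 1)) == lemma1_condition(n)
               for n in range(1, limit + 1))
```

The first values of $n$ satisfying the condition are 2, 6, 8, 10, 14; by Remark 2 these are the indices related to trivial iterations.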
Table 1: the suffixes $S_n$ for 32 successive iterations starting from some number $k$ divisible by 32. The
righthand [lefthand] part of the table applies if the current letter of $T_\infty^a$ is equal [resp., not equal] to the
previous one. Trivial iterations are omitted.

[Table body not recoverable. Columns in each part: iteration no. ($n$); prohibitions (start, end); $S_{n-1}$.]


Since $P_n$ is a factor of ${}_\infty T$ while ${}_\infty T$ is an infinite product of the blocks $T_2^a = abba$ and $T_2^b = baab$,
one of the blocks $T_2^x$ ends in the second position of $P_n$. First consider the following occurrence of $P_n$ in
${}_\infty T$:

$${}_\infty T = \cdots abba\ abba\ \overbrace{baab}^{T_2^b}\ \overbrace{baab}^{T_2^b} \cdots \tag{3}$$

where $P_n$ begins with the last two letters of the second block $abba$.

Since $P'_{n-1} = bbaab\cdots$, the word $S_{n-1}$ ends with $abba$. Therefore, we cannot put $S_n = ab$ (otherwise
$S'_n$ will have the suffix $baab$). Further, $P_{n-1}$ starts with $abaab$, whence the first letter of $S_n$ is $a$ by (S1).
Hence, according to rule 1, the only possibility for $S_n$ is $T_2^aT_2^bT_1^a = abbabaabab$. It is easy to see that
$S_{n+1} = ba$ satisfies both (S1) and (S2).
If the last embraced 2-block of (3) is $T_2^a$, not $T_2^b$, then we have, up to renaming the letters, the same
case as below:

$${}_\infty T = \cdots \overbrace{baab}^{T_2^b}\ \overbrace{abba}^{T_2^a}\ \overbrace{baab}^{T_2^b} \cdots$$

where $P_n$ begins with the last two letters of the first block $baab$.

We assign, as above, $S_n = T_2^bT_2^aT_1^b$ and $S_{n+1} = T_1^a$. The problem appears on the $(n+5)$th iteration,
because

$$P'_{n+4} = bbabbabaab\cdots,$$

i.e., $S_{n+4}$ cannot end with $ba$ or $ab$. Here we have an exclusion from the general method. We use the
following trick. At the next three iterations ($(n+5)$th to $(n+7)$th, the last of them being trivial) we
have to add the prefix $baa$ to the fixed context. We will do this by prohibiting 3-letter contexts instead of
single letters. The word $P_{n+3} = babbaba\cdots$ has three left contexts of length 3: $aab$, $baa$, and $bba$. We
will prohibit $bba$ on the $(n+5)$th iteration and $aab$ on the $(n+6)$th one. To do this, we deliberately put
$P'_{n+4} = bbababbabaab\cdots$, $P'_{n+5} = aabbabbabaab\cdots$. This allows us to choose $S_{n+4} = ba$, $S_{n+5} = ab$.

Remark 3. The above trick leads to one local violation of the general rule on $X_n$. Namely, $|X_{n+5}| = n+4$
(this word coincides with $X_{n+4}$). The situation is corrected on the next iteration, when we get
$|X_{n+6}| = n+7$ (and the $(n+7)$th iteration is trivial).

Remark 4. The word $T_2^aT_2^aT_2^bT_2^aT_2^a = \theta^2(aabaa)$ is not a factor of ${}_\infty T$. Hence, the factor $T_2^xT_2^{\bar x}T_2^x$ occurs
in ${}_\infty T$ inside the factor $T_2^aT_2^bT_2^aT_2^b$ or $T_2^bT_2^aT_2^bT_2^a$. Each such factor requires two uses of the above trick
with 3-letter contexts.

Let us consider the 108-uniform morphism $\psi : \Sigma^* \to \Sigma^*$, defined by the rules

yr(a) = TfT£TfT?TjT$l%TZT$lfT£Tt7%TjT£, (4a) 
y(b) = T%T%T$T%TgT2T%T2T%T%T$Tl[T$TZT%. (4b) 

Note that the words $\psi(b)$ and $\psi(a)$ coincide up to renaming the letters. A computer check shows that the
word $\psi(aabbababbabbaabaababaabb)$ is cube-free. Hence by Theorem 1, $\psi$ is a cube-free morphism
and the word $\psi(T_\infty^a)$ is cube-free. So we put $S'_\infty = \psi(T_\infty^a)$. The $\psi$-image of one letter equals the product
$S_{n-1}S_n\cdots S_{n+30}$ for some number $n$ divisible by 32, see Table 1. The only exception is described below.
Thus, such a $\psi$-image corresponds to 32 successive iterations, during which a 5-block is added to the
fixed left context $X_{n-1}$ to get $X_{n+31}$.

There are two different factorizations of the $\psi$-image of a letter, depending on the positions of the
factors $T_2^aT_2^bT_2^aT_2^b$ and $T_2^bT_2^aT_2^bT_2^a$ inside and on the borders of the current 5-block of ${}_\infty T$. These
factorizations are presented in the two parts of Table 1. The mentioned factors occur in the middle of
$(2k+1)$-blocks for each $k \ge 2$. Thus, these factors occur in the middle of each 5-block, and also at the
border of two equal 5-blocks. For the latter case, the factorization of the $\psi$-image of the second of two
equal letters is given in the righthand part of Table 1. In the lefthand part of Table 1 there are two
possibilities for $S_{n+29}$: the longer [shorter] one should be used if the next 5-block is equal [respectively,
not equal] to the current one. In the first case, $S_{n+29}$ consists of the last two letters of the $\psi$-image of the
current letter and the first four letters of the $\psi$-image of the next letter. In the second case, $S_{n+29}$ consists
exactly of the last two letters of the $\psi$-image.

The first several iterations are special. Namely, for the regularity of the general scheme, we artificially
put $W_3 = W_0S_{-1}S_1$ (the 1st and the 3rd iterations are trivial by the general condition).

Thus, we have defined the words $S_n$ and then the words $W_n$ for all positive integers $n$. The correctness of
the construction is based on the following lemma.

Lemma 3. The word $X_nW_n$ is cube-free for all $n \in \mathbb{N}_0$.

Proof. We prove by induction that all the words $V_n = (X_nW_nS_nx_n)^3$, where $x_n$ is the letter forbidden on
the $(n+1)$th iteration, have no proper factors that are cubes. This fact immediately implies the statement of
the lemma. The inductive base $n \le 4$ can be easily checked by hand or by computer. Let us prove the
inductive step. The structure of the word $V_n$ is illustrated by the following picture.

$$V_n = \underbrace{X_n\,W_n\,S_n\,x_n}_{p_n}\ \underbrace{X_n\,W_n\,S_n\,x_n}_{p_n}\ \underbrace{X_n\,W_n\,S_n\,x_n}_{p_n}$$

Assume to the contrary that the word $V_n$, $n \ge 5$, contains some cube $U^3$. Of course, it is enough to
consider the case when the $(n+1)$th iteration is nontrivial. The factor $U^3$ of $V_n$ has periods $q = |U|$ and
$p_n = |V_n|/3$, but obviously does not satisfy the interaction property. Hence, $|U^3| = 3q \le q + p_n - 2$ by
the Fine and Wilf theorem, yielding $q \le p_n/2 - 1$. On the other hand, by definition of $W_n$, the longest
proper suffix of the word $X_nW_n$ coincides with the longest proper prefix of $V_{n-1}$. If $U^3$ contains this
prefix, then the latter has periods $q$ and $p_{n-1} = |V_{n-1}|/3$. Since the length of this prefix is $3p_{n-1} - 1$,
applying the Fine and Wilf theorem again, we get $3p_{n-1} - 1 \le q + p_{n-1} - 2$, that is, $p_{n-1} \le (q-1)/2$.
Excluding $q$ from the two obtained inequalities, we get $p_n > 4p_{n-1} + 3$. But
$p_n = |V_{n-1}| + |S_n| + 1 \le 3p_{n-1} + 17$. Thus, $p_{n-1} < 14$. For $n \ge 5$, this is not the case. So, we conclude
that $U^3$ does not contain the word $X_nW_n$.

Claim 1. The word $S'_n$ occurs in $V_n$ only three times.

Proof. Recall that $S'_n$ is a product of 2-blocks (possibly except the last "odd" 1-block), and if $n \ge 5$,
then $S'_n$ begins with a 4-block. Hence, $S'_n$ has no factor $W_0$ and, moreover, cannot begin inside $W_0$.
Furthermore, it can be checked by hand or by computer that $S'_\infty$ has no Thue-Morse factors of length
$> 48$. Now looking at the structure of $S'_n$ and of $V_n$ one can conclude that any "irregular" occurrence of
$S'_n$ in $V_n$ should be a prefix of some word $S'_kP'_kW_0$, where $k < n$. The word $S'_k$ is a proper prefix of $S'_n$.
The word $P'_k$ is obtained from a Thue-Morse factor by changing the first letter, and hence never begins
with a 2-block. Hence, the only possibility is $k = n-1$, and $S_n$ should be the 1-block coinciding with the
prefix of $P'_k$. By Table 1, in all cases when $S_n$ is a 1-block, $P'_{n-1}$ begins with the square of a letter, so this
possibility cannot take place. □

Claim 2. The word $X_nW_nS_nx_n$ is cube-free.

Proof. The word $X_nW_n$ is a factor of $V_{n-1}$ and hence is cube-free by the inductive assumption. Using
again the fact that $S'_n$ is "almost" a product of 2-blocks, we conclude that $S'_nx_n$ is also cube-free. So,
a cube in $X_nW_nS_nx_n$, if any, contains inside the suffix $S'_{n-1}$ of the word $W_n$. This suffix is preceded by
$W_0 = aabaaba$; the latter word breaks all periods of $S'_{n-1}$ and does not produce a cube. Hence, the cube
should contain more than one occurrence of the factor $S'_{n-1}$. Applying Claim 1 to the words $S'_{n-1}$ and
$V_{n-1}$, we see that the cube has the period $p_{n-1} = (|X_nW_n| + 1)/3$. But this is impossible by condition (S1).
The claim is proved. □

Combining Claim 2 with the fact that $U^3$ has no factor $X_nW_n$, we get that $U^3$ is contained inside the
word $X_nW_nS_nx_nX_nW_n$. Furthermore, if $S'_n$ is a factor of $U^3$, then the middle occurrence of $U$ is inside $S'_n$
(otherwise, $U^3$ contains one more occurrence of $S'_n$, contradicting Claim 1). In this case, the positions of
all factors $aa$ and $bb$ in $U$ have the same parity. But the rightmost occurrence of $U$ in $U^3$ contains a suffix
of $S'_n$ followed by a prefix of the word $x_nX_n = P'_n$. The letter $x_n$ breaks this parity of positions, which is
impossible. The cases in which all the positions of $aa$ and $bb$ in the rightmost occurrence of $U$ are on the
same side of the letter $x_n$ can be easily checked by hand. Thus, we obtain that $S'_n$ is not a factor of $U^3$.
Thus, $U^3$ begins inside the factor $S'_nx_n$.

Where does the word $U^3$ end? It is easy to see that the word

$$X_nW_n = \bar x_{n-1}X_{n-1}W_{n-1}S_{n-1}\,x_{n-1}X_{n-1}W_{n-1}S_{n-1}\,x_{n-1}X_{n-1}W_{n-1}S_{n-1}$$

has the same three occurrences of the factor $S'_{n-1}$ as $V_{n-1}$. So, if $U^3$ contains $S'_{n-1}$, then the middle
occurrence of $U$ is inside $S'_{n-1}$. But this is impossible because $S'_{n-1}$ is a rather short suffix of $W_{n-1}$ and
the whole word $X_nW_n$ is cube-free. Therefore, $U^3$ should end inside the prefix $\bar x_{n-1}X_{n-1}W_{n-1}S_{n-1}$ of
$X_nW_n$, like in the following picture.

[Picture: the cube $UUU$ begins inside $S'_nx_n$ and ends inside the prefix $\bar x_{n-1}X_{n-1}W_{n-1}S_{n-1}$ of $X_nW_n$.]

Using the same parity argument as above, we conclude that the word $S'_nx_nX_n = S'_nP'_n$ is cube-free and,
moreover, $U^3$ should contain the prefix $aabaa$ of the word $W_{n-1}$. Two cases are to be considered: either
$aabaa$ is a factor of $U$, or $aabaa$ occurs in $U^3$ only twice, on the borders of consecutive $U$'s. The
second case is impossible, because two closest occurrences of $aabaa$ in $W_{n-1}$ are separated by the factor
$babaababbaabbabaabaabb$ which does not contain $P'_n$ as a suffix. For the first case, we get that some
(not the leftmost) occurrence of $aabaa$ in $U^3$ is preceded by the concatenation of some suffix of $S'_n$ and
the word $P'_n$. If this occurrence of $aabaa$ is a prefix of some $W_0$, then it is preceded by some $P'_k$, $k < n$.
But $P'_k$ is not a suffix of $P'_n$, a contradiction. The remaining position for this occurrence of $aabaa$ is the
border of some words $S'_k$ and $P'_k$. But then $S'_k$ contains the factor which is on the border between $S'_n$ and
$P'_n$, and the parity argument shows that $S'_k$ cannot be partitioned into 2-blocks. This final contradiction
shows that $U^3$ cannot be a factor of $V_n$. The lemma is proved. □



By construction, the word $X_n$ is the fixed left context of $W_n$. Now we consider the second step, that
is, the completion of such an "almost uniquely" extendable word $W_n$ to a premaximal word. The main idea
is the same as at the first step. In order to obtain a premaximal word of level $n$, we build the word $W_{n+1}$
in $n+1$ iterations by scheme (2a) and then prohibit the extension of $W_{n+1}$ by the first letter of the word
$P_n$. We denote the obtained premaximal word of level $n$ by $\bar W_n$. Then

$$\bar W_n = W_{n+1}\bar S_n\,P_n\,W_{n+1}\bar S_n\,P_n\,W_{n+1}\bar S_n, \tag{5}$$

where $\bar S_n$ is a "buffer" inserted similarly to $S_n$ in order to avoid cubes at the border of the occurrences of
$W_{n+1}$ and $P_n$. In contrast to the first step, we do not need to build a cube-free right-infinite word, because
the construction (5) is used only once. The form of the word $\bar S_n$ depends on the last iteration according
to Table 1; this dependence is described in Table 2. We choose $\bar S_n$ to be the left extension of the word $P_n$
within ${}_\infty T$ (recall that $P_n = {}_\infty T(n+1 \ldots 1)$).

The above idea works without additional gadgets in all cases when $|X_n| = n$. Due to the following
obvious remark, it is enough to construct left premaximal words of level $n$ for all $n$ such that $|X_n| = n$;
hence, we do not consider constructing the words $\bar W_n$ for other values of $n$.
Table 2: the "final" suffixes $\bar S_n$ for the corresponding iterations from Table 1. The first column contains
the number of the last iteration.

[Table body not recoverable. Columns in each part: iteration no. ($n$); prohibitions (start); final suffix.]


Remark 5. In order to prove Theorem 2, it is sufficient to show the existence of left premaximal
words of level $n$ for infinitely many different values of $n$. Indeed, if a word $W$ is left premaximal of level
$n$ and $a_1\cdots a_nW$ is a left maximal word, then the word $a_nW$ is left premaximal of level $n-1$.

Using the facts that $W_{n+1} \in \mathrm{CF}$, $\bar S_nP_n \in \mathrm{TM}$, and the suffix $S'_n$ of $W_{n+1}$ has no long Thue-Morse factors
(this is the property of any $\psi$-image), we prove the following lemma. The proof resembles the one of
Lemma 3.

Lemma 4. The word $X_n\bar W_n$ is cube-free for all $n \in \mathbb{N}_0$.

Since the word $P_n\bar W_n$ is a cube by (5) and at the same time $P_n = X_{n+1}$ is the fixed left context of $W_{n+1}$,
we conclude that $X_n$ is the longest left context of the word $\bar W_n$. Theorem 2 is proved.

Remark 6. For any $n$, the word $\mathrm{rev}(\bar W_n) = \bar W_n(|\bar W_n|)\cdots\bar W_n(1)$ is right premaximal of level $n$.

Remark 7. Our construction provides an upper bound for the length of the shortest left premaximal
word of any given level $n$. The results of 'Ml/ suggest that this length is exponential in $n$. Let $l(n) = |W_n|$.
For nontrivial iterations, we have $l(n) = 3l(n-1) + O(n)$. It is well known that two successive letters
in the Thue-Morse word are equal with probability 1/3. Thus, to obtain $W_n$, we make approximately
$2n/3$ nontrivial iterations. So, $l(n)$ is exponential at base $3^{2/3} \approx 2.08$. The same property holds for
$|\bar W_n| = 3l(n+1) + O(n)$. It is interesting whether this asymptotics is the best possible.
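The two numerical facts used in Remark 7 can be checked directly. The sketch below is only an illustration with names chosen here: it estimates the frequency of equal neighbouring letters in a prefix of the Thue-Morse word and evaluates the growth base $3^{2/3}$ that results from tripling the length on roughly two out of every three iterations.

```python
def tm_bit(n: int) -> int:
    """Letter number n (from 0) of the Thue-Morse word, coded as 0 or 1."""
    return bin(n).count("1") % 2


def equal_pair_fraction(limit: int) -> float:
    """Fraction of n in 1..limit whose letter equals the previous letter."""
    equal = sum(tm_bit(n) == tm_bit(n - 1) for n in range(1, limit + 1))
    return equal / limit


# Tripling the length on about 2n/3 of the first n iterations gives 3**(2n/3),
# i.e. exponential growth at base 3**(2/3).
GROWTH_BASE = 3 ** (2 / 3)
```

On the prefix of length 4096 the fraction of equal pairs is 1365/4096, which is close to 1/3, and `GROWTH_BASE` rounds to 2.08.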
Sketch of the proof of Theorem 3. Similar to Remark 5, it is enough to build premaximal words of level
$(n_i,n_i)$ for some infinite sequence $n_1 < n_2 < \ldots < n_i < \ldots$ of positive integers. We take $n_i = 32i + 3$
(Table 2 indicates that $\bar S_{n_i} = \lambda$, which makes the construction easier). The natural idea is to concatenate
left premaximal and right premaximal words through some "buffer" word. But we cannot use the words
$\bar W_n$ for this purpose, because all words $X_n\bar W_n$ appear to be right maximal.

So, we modify the last step in constructing left premaximal words as follows. The proof of Lemma 3
implies that the word $X_nW_nS_n\cdots S_{n+l}$ is cube-free for any $l$. So, we put

$$W'_{n_i} = W_{n_i+1}S_{n_i+1}S_{n_i+2}\,P_{n_i}\,W_{n_i+1}S_{n_i+1}S_{n_i+2}\,P_{n_i}\,W_{n_i+1}S_{n_i+1}S_{n_i+2}.$$

By Table 1, $S_{n_i+3} = \lambda$ and $S_{n_i+4}(1) \ne S_{n_i+1}(1) = x$. The proof of the fact that $X_{n_i}W'_{n_i} \in \mathrm{CF}$ reproduces
the proof of Lemma 4. Recall that $S_{n_i+1}(1) = P_{n_i}(1)$ by (S1), yielding that this letter breaks the period
of $W_{n_i+1}$ (see (2b)). On the other hand, the letter $\bar x$ breaks the global period of the word $W'_{n_i}$. Hence,
the condition $X_{n_i+1}W_{n_i+1}S_{n_i+1}\cdots S_{n_i+l} \in \mathrm{CF}$ implies $X_{n_i}W'_{n_i}S_{n_i+4}\cdots S_{n_i+l} \in \mathrm{CF}$ for any $l$. Thus, $W'_{n_i}$ is a
left premaximal word of level $n_i$ that is infinitely extendable to the right.

Choose an even $m$ such that $|X_{n_i}W'_{n_i}| < 2^{m-2}$ and consider the word $W_{n_i,n_i} = W'_{n_i}T_m^x\,\mathrm{rev}(W'_{n_i})$:

$$W_{n_i,n_i} = \cdots S'_{n_i+2}\ T_m^x\ \mathrm{rev}(S'_{n_i+2}) \cdots$$

It remains to prove that the word $X_{n_i}W_{n_i,n_i}\mathrm{rev}(X_{n_i})$ is cube-free. By the choice of $m$ and
overlap-freeness of $T_m^x$, no cube can contain the factor $T_m^x$. So, by symmetry, it is enough to check that the
word $U = X_{n_i}W'_{n_i}T_m^x$ is cube-free. Assume to the contrary that it contains a cube $YYY$. Recall that the
word $X_{n_i}W'_{n_i}$ is cube-free. Since the first letter of $T_m^x$ breaks the period of $X_{n_i}W'_{n_i}$, one has $|Y| < \mathrm{per}(W'_{n_i})$.
Consider the rightmost factor $aabaa$ in $U$; it is inside the factor $W_0$ immediately before the suffix $S'_{n_i+2}$
of $W'_{n_i}$. If this factor belongs to $YYY$, then $|Y|$ symbols to the left we have another $aabaa$, followed
by $S'_{n_i+2}$. Then $|Y| = \mathrm{per}(W'_{n_i})$, a contradiction. Hence, $YYY$ has no factors $aabaa$, i.e., is a factor of
$abaabaS'_{n_i+2}T_m^x$. One can check that the word $S'_{n_i+2}$ contains no Thue-Morse factors of length $> 48$. The
shorter factors can be checked by brute force.

Thus, the word $W_{n_i,n_i}$ is premaximal of level $(n_i,n_i)$. The theorem is proved. □